
    Finite difference methods fengshui: alignment through a mathematics of arrays

    Numerous scientific-computational domains make use of array data. The core computation of the numerical methods and algorithms involved centers on multi-dimensional array manipulation. Memory layout and the access patterns of that data are crucial to the optimal performance of array-based computations. As we move towards exascale computing, writing portable code for efficient data-parallel computations increasingly requires an abstract, productive working environment. To that end, we present the design of a framework for optimizing scientific array-based computations, building a case study for a Partial Differential Equations solver. By embedding the Mathematics of Arrays formalism in the Magnolia programming language, we assemble a software stack capable of abstracting the continuous high-level application layer from the discrete formulation of the collective array-based numerical methods and algorithms and from the final detailed low-level code. The case study lays the groundwork for achieving optimized memory layout and efficient computations while preserving a stable abstraction layer independent of underlying algorithms and changes in the architecture. Peer reviewed. Postprint (author's final draft).
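
To make the point about memory layout concrete, the following is a minimal Python sketch of how one multi-dimensional index maps to different flat memory offsets under row-major and column-major layouts. The function name `flat_offset` and its `order` parameter are illustrative assumptions and do not come from the paper or from the Magnolia/MoA framework it describes.

```python
def flat_offset(index, shape, order="row"):
    """Map a multi-dimensional index to an offset in a flat buffer.

    Hypothetical helper for illustration only.
    order="row"    -> last axis varies fastest (C-style layout)
    order="column" -> first axis varies fastest (Fortran-style layout)
    """
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of bounds for shape")
    pairs = list(zip(index, shape))
    if order == "column":
        pairs.reverse()
    elif order != "row":
        raise ValueError("order must be 'row' or 'column'")
    # Horner-style accumulation over the axes, slowest-varying axis first.
    offset = 0
    for i, n in pairs:
        offset = offset * n + i
    return offset


if __name__ == "__main__":
    shape = (3, 4)
    print(flat_offset((1, 2), shape, order="row"))
    print(flat_offset((1, 2), shape, order="column"))
```

Traversing an array in the order that matches its layout touches consecutive offsets, which is why the same loop nest can perform very differently under the two layouts.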

    Intermediate Code Generation for Portable Scalable, Compilers. Architecture Independent Data Parallelism: The Preliminaries

    This paper introduces the goals of the Portable, Scalable, Architecture Independent (PSI) Compiler Project for Data Parallel Languages at the University of Missouri-Rolla. A goal of this project is to produce a subcompiler for data parallel scientific programming languages such as HPF (High Performance Fortran), in which the input grammar is translated to a three-address code intermediate language. Ultimately we plan to integrate our work into automated synthesis systems for scientific programming, because we feel that it should not be necessary to learn complicated programming techniques to use multiprocessor computers or networks of computers effectively. This paper shows how to compile a data parallel language to an arbitrary multiprocessor topology or network of CPUs, given the number of processors, the length of the vector registers, and the total number of components in an array, assuming a message-passing, distributed-memory paradigm of send and receive. We emphasize that this paradigm is amenable not only to machines such as the CM5 and NCube but also to LAN- and WAN-connected architectures. We do automatic program partitioning and mapping to the processing elements of a multiprocessor architecture or distributed network of machines. No programmer intervention is required; hence, no errors will be introduced through data decomposition.
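
As a rough illustration of automatic data decomposition from the number of processors and the total number of array components, the Python sketch below computes a contiguous block partition. It is an assumed, generic scheme and not the PSI compiler's actual algorithm; the names `block_partition` and `owner_of` are hypothetical, and vector register length is deliberately left out.

```python
def block_partition(total_components, num_processors):
    """Return one half-open (start, stop) index range per processor.

    Generic block decomposition for illustration: block sizes differ
    by at most one, and the first `extra` processors get the larger blocks.
    """
    if num_processors <= 0:
        raise ValueError("num_processors must be positive")
    if total_components < 0:
        raise ValueError("total_components must be non-negative")
    base, extra = divmod(total_components, num_processors)
    bounds = []
    start = 0
    for rank in range(num_processors):
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def owner_of(index, total_components, num_processors):
    """Return the rank of the processor whose block contains `index`."""
    for rank, (start, stop) in enumerate(
        block_partition(total_components, num_processors)
    ):
        if start <= index < stop:
            return rank
    raise IndexError("index outside the array")


if __name__ == "__main__":
    print(block_partition(10, 4))
    print(owner_of(7, 10, 4))
```

Because the ranges are derived purely from the two counts, no hand-written decomposition is needed, which mirrors the abstract's claim that removing programmer intervention removes a source of errors.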

    P3 problem and Magnolia language: Specializing array computations for emerging architectures

    The problem of producing portable high-performance computing (HPC) software that is cheap to develop and maintain is called the P3 (performance, portability, productivity) problem. Good solutions to the P3 problem have been achieved when the performance profiles of the target machines have been similar. The variety of HPC architectures is, however, large and can be expected to grow larger. Software for HPC therefore needs to be highly adaptable, and there is a pressing need to provide developers with tools to produce software that can target machines with vastly different profiles. Multi-dimensional array manipulation constitutes a core component of numerous numerical methods, such as finite difference solvers of Partial Differential Equations (PDEs). The efficiency of these computations is tightly connected to traversing and distributing array data in a hardware-friendly way. The Mathematics of Arrays (MoA) allows for formally reasoning about array computations and enables systematic transformations of array-based programs, e.g., to use data layouts that fit a specific architecture. This paper presents a programming methodology aimed at tackling the P3 problem in domains that are well explored, using Magnolia, a language designed to embody generic programming. The Magnolia programmer can restrict the semantic properties of abstract generic types and operations by defining so-called axioms. Axioms can be used to produce tests for concrete implementations of specifications, for formal verification, or to perform semantics-preserving program transformations. We leverage Magnolia's semantic specification facilities to extend the Magnolia compiler with a term rewriting system. We implement MoA's transformation rules in Magnolia, and demonstrate through a case study on a finite difference solver of PDEs how our rewriting system allows exploring the space of possible optimizations. Published version.
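
The idea of using an axiom as a semantics-preserving rewrite rule can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the tuple-based term encoding, the function names, and the single rule (two nested cyclic rotations fuse into one) are not taken from the paper, Magnolia, or MoA's actual rule set.

```python
def rotate(shift, xs):
    """Cyclically rotate a list left by `shift` positions."""
    if not xs:
        return list(xs)
    k = shift % len(xs)
    return xs[k:] + xs[:k]


def evaluate(term, env):
    """Evaluate a term: a variable name or ("rotate", shift, subterm)."""
    if isinstance(term, str):
        return env[term]
    head, shift, sub = term
    if head != "rotate":
        raise ValueError("unknown operation: " + str(head))
    return rotate(shift, evaluate(sub, env))


def rewrite(term):
    """Apply the illustrative axiom bottom-up until it no longer matches:
    rotate(a, rotate(b, x)) == rotate(a + b, x)
    """
    if isinstance(term, str):
        return term
    head, shift, sub = term
    sub = rewrite(sub)
    if head == "rotate" and isinstance(sub, tuple) and sub[0] == "rotate":
        _, inner_shift, inner_sub = sub
        return rewrite(("rotate", shift + inner_shift, inner_sub))
    return (head, shift, sub)


if __name__ == "__main__":
    term = ("rotate", 1, ("rotate", 2, "u"))
    env = {"u": [0, 1, 2, 3, 4]}
    optimized = rewrite(term)
    print(optimized)
    print(evaluate(term, env) == evaluate(optimized, env))
```

The rewritten term performs one array traversal instead of two while producing the same result, which is the kind of trade-off a rewriting system can explore mechanically once the rules are stated as axioms.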

    A Reduction Semantics for Array Expressions: The PSI Compiler

    High Performance Computing is concerned not only with the performance of a program on a very fast processor, but also with high performance on many fast processors, perhaps of different speeds and topologies. Scientific computation and modeling often require real-time visualizations on high-resolution graphics terminals. The data structure most widely used in scientific computation is the array. Arithmetic operations and permutations on arrays are often used, and the associated algebra transcends many scientific disciplines. In order to achieve high performance computing, the interfaces through which people interact with computers must be rapid and accurate, with the ability to move across many platforms and adapt quickly to change. If a programming language supports an algebra useful to scientists, the programming language, and subsequent compiler, must optimize the resources of a particular machine or machines. That is, the compiled program must allocate the least amount of memory and do the least amo..